#### Achieving Area Efficient Parallel FIR Digital Filter Structures for Symmetric Convolutions Using VLSI Implementation



Asian Journal of Engineering and Applied Technology (AJEAT), Vol. 2, No. 1, 2014, pp. 18-22. Available at: www.goniv.com. Paper Received: 05-03-2014. Paper Published: 28-03-2014. Paper Reviewed by: 1. John Arhter 2. Hendry Goyal. Editor: Prof. P.Muthukumar

### ACHIEVING AREA EFFICIENT PARALLEL FIR DIGITAL FILTER STRUCTURES FOR SYMMETRIC CONVOLUTIONS USING VLSI IMPLEMENTATION

#### Rakesh.A, Mrs. Pushpalatha.V

Dept. of Electronics and Communication, Saveetha Engineering College, Chennai, India. pushpalatha@saveetha.ac.in, rakeshvimalraj@gmail.com

#### ABSTRACT


Based on fast finite impulse response (FIR) algorithms (FFAs), this paper proposes new parallel FIR filter structures that reduce the hardware cost for symmetric coefficients, under the condition that the number of taps is a multiple of two, three, or four. The main aim of this project is to achieve a VLSI implementation using polyphase decomposition. The two-, three-, and four-parallel FIR filters are derived and then represented as blocks. Exchanging multipliers for adders is advantageous because adders occupy less silicon area than multipliers, so the hardware complexity is reduced. The proposed parallel FIR structures exploit the inherent nature of symmetric coefficients, reducing the number of multipliers in the subfilter section by half at the expense of additional adders in the preprocessing and postprocessing blocks. The two-, three-, and four-parallel FIR filter implementations are simulated through MATLAB (SIMULINK tool) and XILINX (SYSTEM GENERATOR).

*Index Terms*: Digital Signal Processing (DSP), Fast Finite Impulse Response (FIR) Algorithms (FFAs), Matrix Laboratory (MATLAB), Polyphase Decomposition, Symmetric Convolutions, Subfilter, Symmetric Coefficients, Very Large Scale Integration (VLSI)

#### **1. INTRODUCTION**

Technological advancement has raised the demand for low-power, high-performance digital signal processing (DSP). A finite impulse response (FIR) filter is a filter whose impulse response is of finite duration, because it settles to zero in finite time. FIR digital filters are among the most widely used basic devices in DSP systems, for example in video and image processing. Some applications, such as video processing, need the FIR filter to operate at high frequencies, whereas others, such as multiple-input multiple-output (MIMO) systems used in cellular telephony, need high throughput with a low-power circuit. When narrow transition-band characteristics are required, a much higher filter order is unavoidable.

Parallel processing and pipelining are two techniques used in DSP applications that can be exploited to reduce power consumption. Pipelining shortens the critical path by interleaving pipelining latches along the data path, at the price of more latches and higher system latency. Parallel processing increases the sampling rate by replicating hardware so that multiple inputs are processed in parallel and multiple outputs are generated at the same time, at the expense of increased area. Both techniques can reduce power consumption by lowering the supply voltage when the sampling speed does not need to increase. With the continuing trend to reduce chip area and to integrate multi-chip solutions into a single-chip solution, it is important to limit the silicon area required to implement a parallel FIR digital filter in VLSI [2]. In many design situations the hardware overhead incurred by parallel processing cannot be tolerated because of limits on design area. It is therefore advantageous to realize parallel FIR filtering structures that consume less area than traditional parallel FIR filtering structures [2].

Because the hardware implementation cost grows linearly with the block size L, parallel processing loses its advantage in practical applications. Our goal is therefore to take a comprehensive look at all aspects, from filter design to implementation, to produce low-area parallel FIR filter structures. Polyphase decomposition is the main tool: small-sized parallel FIR filter structures are derived first, and larger block sizes can then be constructed by cascading or iterating small-sized parallel FIR filtering blocks.

#### 2. FAST FIR ALGORITHM (FFA)

Consider an N-tap FIR filter which can be expressed in the general form as

$$y(n) = \sum_{i=0}^{N-1} h(i)\, x(n-i), \qquad n = 0, 1, 2, \ldots, \infty \quad (1)$$

where $\{x(n)\}$ is an infinite-length input sequence and $\{h(i)\}$ are the length-N FIR filter coefficients. The traditional L-parallel FIR filter can then be derived using polyphase decomposition as

$$\sum_{p=0}^{L-1} Y_{p}(Z^{L})\, Z^{-p} = \sum_{q=0}^{L-1} X_{q}(Z^{L})\, Z^{-q} \sum_{r=0}^{L-1} H_{r}(Z^{L})\, Z^{-r} \quad (2)$$

where

$$X_{q} = \sum_{k=0}^{\infty} Z^{-k} x(Lk+q), \qquad H_{r} = \sum_{k=0}^{N/L-1} Z^{-k} h(Lk+r), \qquad Y_{p} = \sum_{k=0}^{\infty} Z^{-k} y(Lk+p),$$

for p, q, r = 0, 1, 2, ..., L-1. Equation (2) shows that the traditional L-parallel FIR filter requires $L^2$ FIR subfilter blocks of length N/L for implementation.
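As a minimal sketch of the decomposition in (2), the following Python fragment splits h and x into L polyphase components, runs all L × L subfilter convolutions, and interleaves the output phases. The function names and test values are illustrative assumptions and are not taken from the paper; both sequence lengths are assumed to be multiples of L.

```python
def conv(a, b):
    """Full linear convolution of two lists (one subfilter operation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def parallel_fir(h, x, L):
    """Traditional L-parallel FIR; len(h) and len(x) must be multiples of L."""
    H = [h[r::L] for r in range(L)]   # H_r(k) = h(Lk + r)
    X = [x[q::L] for q in range(L)]   # X_q(k) = x(Lk + q)
    block_len = (len(h) + len(x)) // L
    Y = [[0] * block_len for _ in range(L)]
    subfilters = 0
    for q in range(L):
        for r in range(L):
            prod = conv(H[r], X[q])
            subfilters += 1
            # Terms with q + r >= L wrap to phase q + r - L with one block delay.
            p, delay = (q + r) % L, (q + r) // L
            for k, v in enumerate(prod):
                Y[p][k + delay] += v
    # y(Lk + p) = Y_p(k): interleave the L output phases.
    y = [Y[p][k] for k in range(block_len) for p in range(L)]
    return y, subfilters


h = [1, 2, 3, 4, 5, 6]                        # N = 6 taps (arbitrary test values)
x = [2, -1, 0, 3, 1, 4, -2, 5, 7, 0, 1, -3]   # 12 input samples
y, count = parallel_fir(h, x, L=3)
direct = conv(h, x)
print(y[:len(direct)] == direct, count)
```

The subfilter counter reaches L² (here 9), which is the cost that the FFA structures in the following sections reduce.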

#### EXISTING SYSTEM

A. 2×2 FFA (L=2)

According to (2), a two-parallel FIR filter can be expressed as

$$Y_0 + Z^{-1}Y_1 = (H_0 + Z^{-1}H_1)(X_0 + Z^{-1}X_1) = H_0X_0 + Z^{-1}(H_0X_1 + H_1X_0) + Z^{-2}H_1X_1 \quad (3)$$

implying that

$$\begin{aligned} Y_0 &= H_0 X_0 + Z^{-2} H_1 X_1, \\ Y_1 &= H_0 X_1 + H_1 X_0. \end{aligned} \quad (4)$$



Fig.1. Two-parallel FIR filter implementation using FFA.

Equation (4) shows the traditional two-parallel filter structure, which requires four length-N/2 subfilter blocks and two postprocessing adders, for a total of 2N multipliers and 2N-2 adders. However, (4) can be rewritten as

$$\begin{aligned} Y_0 &= H_0 X_0 + Z^{-2} H_1 X_1, \\ Y_1 &= (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1. \end{aligned} \quad (5)$$

The implementation of (5) requires three FIR subfilter blocks of length N/2, one preprocessing adder and three postprocessing adders, for a total of 3N/2 multipliers and 3(N/2-1)+4 adders. Compared with the traditional two-parallel filter of (4), this reduces the hardware cost by approximately one fourth (3N/2 multipliers instead of 2N). The two-parallel (L=2) FIR filter implementation using the FFA obtained from (5) is shown in Fig.1.
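The three-subfilter form in (5) can be checked numerically. The sketch below is illustrative only: the helper names and test values are assumptions, and both sequences are assumed to have even length. It evaluates (5) with three convolutions and compares the interleaved result with a direct convolution.

```python
def conv(a, b):
    """Full linear convolution of two lists (one subfilter operation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    return [u + v for u, v in zip(a, b)]


def sub(a, b):
    return [u - v for u, v in zip(a, b)]


def ffa2(h, x):
    """Two-parallel FIR following (5); len(h) and len(x) must be even."""
    H0, H1 = h[0::2], h[1::2]
    X0, X1 = x[0::2], x[1::2]
    a = conv(H0, X0)                      # H0 X0
    b = conv(H1, X1)                      # H1 X1
    c = conv(add(H0, H1), add(X0, X1))    # (H0 + H1)(X0 + X1)
    Y0 = add(a + [0], [0] + b)            # H0 X0 + Z^{-2} H1 X1 (one block delay)
    Y1 = sub(sub(c, a), b) + [0]          # (H0 + H1)(X0 + X1) - H0 X0 - H1 X1
    # y(2k) = Y0(k), y(2k + 1) = Y1(k)
    return [v for pair in zip(Y0, Y1) for v in pair]


h = [3, -1, 4, 2]            # N = 4 taps (arbitrary test values)
x = [1, 2, 0, -3, 5, 1]
y = ffa2(h, x)
print(y[:-1] == conv(h, x))  # last sample is zero padding
```

For N = 24, for example, the three subfilters use 3N/2 = 36 multipliers, against 2N = 48 for the four subfilters of (4).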



Fig.2. Three-parallel FIR filter implementation.

B. 3×3 FFA (L=3)

By a similar approach, a three-parallel FIR filter using FFA can be expressed as

$$\begin{aligned} Y_0 &= H_0 X_0 - Z^{-3} H_2 X_2 + Z^{-3} [(H_1 + H_2)(X_1 + X_2) - H_1 X_1], \\ Y_1 &= [(H_0 + H_1)(X_0 + X_1) - H_1 X_1] - (H_0 X_0 - Z^{-3} H_2 X_2), \\ Y_2 &= [(H_0 + H_1 + H_2)(X_0 + X_1 + X_2)] - [(H_0 + H_1)(X_0 + X_1) - H_1 X_1] - [(H_1 + H_2)(X_1 + X_2) - H_1 X_1]. \end{aligned} \quad (6)$$

The hardware implementation of (6) requires six length-N/3 FIR subfilter blocks, three preprocessing and seven postprocessing adders, for a total of 2N multipliers (six blocks of N/3 taps each) and 2N+4 adders, which reduces the hardware cost by approximately one third relative to the traditional three-parallel filter. The implementation obtained from (6) is shown in Fig.2.
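A small numerical check of the six-subfilter structure is sketched below, under the assumption that both sequence lengths are multiples of 3; the helper names and data are hypothetical. The six `conv` calls correspond to the six subfilter blocks.

```python
def conv(a, b):
    """Full linear convolution of two lists (one subfilter operation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    return [u + v for u, v in zip(a, b)]


def sub(a, b):
    return [u - v for u, v in zip(a, b)]


def delay(s):
    return [0] + s   # Z^{-3}: one block delay


def pad(s):
    return s + [0]   # keeps undelayed terms the same length as delayed ones


def ffa3(h, x):
    """Three-parallel FFA; len(h) and len(x) must be multiples of 3."""
    H0, H1, H2 = h[0::3], h[1::3], h[2::3]
    X0, X1, X2 = x[0::3], x[1::3], x[2::3]
    a = conv(H0, X0)                                       # H0 X0
    b = conv(H1, X1)                                       # H1 X1
    c = conv(H2, X2)                                       # H2 X2
    d = sub(conv(add(H0, H1), add(X0, X1)), b)             # (H0+H1)(X0+X1) - H1X1
    e = sub(conv(add(H1, H2), add(X1, X2)), b)             # (H1+H2)(X1+X2) - H1X1
    f = conv(add(add(H0, H1), H2), add(add(X0, X1), X2))   # (H0+H1+H2)(X0+X1+X2)
    # The bracketed term e enters Y0 with a plus sign after the block delay.
    Y0 = add(sub(pad(a), delay(c)), delay(e))
    Y1 = add(sub(pad(d), pad(a)), delay(c))
    Y2 = pad(sub(sub(f, d), e))
    return [v for trio in zip(Y0, Y1, Y2) for v in trio]


h = [2, -1, 3, 0, 4, 1]              # N = 6 taps (arbitrary test values)
x = [1, 2, -3, 0, 5, 1, -2, 4, 3]
y = ffa3(h, x)
print(y[:-1] == conv(h, x))          # last sample is zero padding
```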

goniv Publications

C. 4×4 FFA (L=4)

By a similar approach, a four-parallel FIR filter using FFA can be expressed as

$$\begin{aligned} Y_0 + Z^{-1}Y_1 + Z^{-2}Y_2 + Z^{-3}Y_3 &= H_0X_0 + Z^{-1}(H_0X_1 + H_1X_0) + Z^{-2}(H_0X_2 + H_1X_1 + H_2X_0) \\ &\quad + Z^{-3}(H_0X_3 + H_1X_2 + H_2X_1 + H_3X_0) + Z^{-4}(H_1X_3 + H_2X_2 + H_3X_1) \\ &\quad + Z^{-5}(H_2X_3 + H_3X_2) + Z^{-6}H_3X_3 \end{aligned} \quad (7)$$
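Collecting the terms of (7) by output phase, in the same way that (4) follows from (3), gives the four outputs below. This is a direct expansion written out as a worked step, not a reduced-complexity structure:

$$\begin{aligned} Y_0 &= H_0X_0 + Z^{-4}(H_1X_3 + H_2X_2 + H_3X_1), \\ Y_1 &= H_0X_1 + H_1X_0 + Z^{-4}(H_2X_3 + H_3X_2), \\ Y_2 &= H_0X_2 + H_1X_1 + H_2X_0 + Z^{-4}H_3X_3, \\ Y_3 &= H_0X_3 + H_1X_2 + H_2X_1 + H_3X_0. \end{aligned}$$

In this form the four-parallel filter uses 16 subfilter blocks of length N/4, that is, 4N multipliers.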

#### **3. PROPOSED FFA STRUCTURES FOR SYMMETRIC CONVOLUTIONS**

The main idea behind the proposed structures is to exploit the symmetry of the coefficients: the polyphase decomposition is manipulated to obtain as many subfilter blocks with symmetric coefficients as possible, so that in each such block half the multiplications can be reused to cover all of its taps.



Fig.3. Proposed two-parallel FIR filter implementation.

Therefore, for an N-tap L-parallel FIR filter, the total number of saved multipliers is the number of subfilter blocks that contain symmetric coefficients times half the number of multiplications in a single subfilter block, N/(2L).
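The saving comes from the usual folded form of a symmetric FIR filter: the two input samples that share a coefficient are added first and then multiplied once. The sketch below illustrates this for an even-length symmetric filter; the function names and data are assumptions chosen for illustration.

```python
def sample(x, n):
    """x(n), with zeros outside the recorded range."""
    return x[n] if 0 <= n < len(x) else 0


def fir_direct(h, x):
    """Direct form: N multiplications per output sample."""
    N = len(h)
    return [sum(h[i] * sample(x, n - i) for i in range(N))
            for n in range(len(x))]


def fir_folded(h, x):
    """Folded form for even-length symmetric h: N/2 multiplications per output."""
    N = len(h)
    assert N % 2 == 0 and h == h[::-1], "expects even-length symmetric coefficients"
    return [sum(h[i] * (sample(x, n - i) + sample(x, n - (N - 1 - i)))
                for i in range(N // 2))
            for n in range(len(x))]


h = [1, 3, -2, 5, 5, -2, 3, 1]     # symmetric: h(n) = h(N - 1 - n)
x = [4, -1, 2, 0, 7, 3, -5, 6, 1, 2]
print(fir_folded(h, x) == fir_direct(h, x))
```

Each output of `fir_folded` uses N/2 products and N/2 extra pre-additions, which is the multiplier-for-adder exchange that the proposed structures rely on.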

A.  $2 \times 2$  Proposed FFA (L=2)

From (4), a two-parallel FIR filter can also be written as

$$\begin{aligned} Y_0 &= \{\tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)] - H_1 X_1\} + Z^{-2} H_1 X_1, \\ Y_1 &= \tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)]. \end{aligned} \quad (8)$$

For a set of even-symmetric coefficients, (8) yields one more subfilter block containing symmetric coefficients than (5), the existing FFA parallel FIR filter. Fig.3 shows the implementation of the proposed two-parallel FIR filter based on (8). The symmetry of the subfilter blocks follows from the symmetry condition on the filter coefficients,

$$h(n) = h(N - 1 - n), \qquad n = 0, 1, 2, \ldots, N - 1. \quad (9)$$
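To make the claim concrete, the following sketch evaluates (8) for a symmetric 8-tap filter and counts how many subfilters have mirrored coefficients. Here antisymmetric subfilters such as H0 - H1 are counted as well, on the assumption that they can share multipliers in the same way with a subtraction in place of the pre-addition. All names and data are illustrative.

```python
def conv(a, b):
    """Full linear convolution of two lists (one subfilter operation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    return [u + v for u, v in zip(a, b)]


def sub(a, b):
    return [u - v for u, v in zip(a, b)]


def mirrored(c):
    """True for symmetric or antisymmetric coefficient sets."""
    return c == c[::-1] or c == [-v for v in reversed(c)]


def proposed2(h, x):
    """Two-parallel FIR following (8); integer data, even lengths."""
    H0, H1 = h[0::2], h[1::2]
    X0, X1 = x[0::2], x[1::2]
    s = conv(add(H0, H1), add(X0, X1))          # (H0 + H1)(X0 + X1)
    d = conv(sub(H0, H1), sub(X0, X1))          # (H0 - H1)(X0 - X1)
    b = conv(H1, X1)                            # H1 X1
    half_sum = [v // 2 for v in add(s, d)]      # exact: s + d = 2(H0X0 + H1X1)
    half_diff = [v // 2 for v in sub(s, d)]     # exact: s - d = 2(H0X1 + H1X0)
    Y0 = add(sub(half_sum, b) + [0], [0] + b)   # {...} + Z^{-2} H1 X1
    Y1 = half_diff + [0]
    y = [v for pair in zip(Y0, Y1) for v in pair]
    return y, [add(H0, H1), sub(H0, H1), H1]


h = [1, 5, -2, 7, 7, -2, 5, 1]                  # satisfies h(n) = h(N - 1 - n)
x = [2, -1, 3, 0, 4, 6]
y, proposed_blocks = proposed2(h, x)
existing_blocks = [h[0::2], h[1::2], add(h[0::2], h[1::2])]   # subfilters of (5)
n_proposed = sum(mirrored(c) for c in proposed_blocks)
n_existing = sum(mirrored(c) for c in existing_blocks)
print(y[:-1] == conv(h, x), n_existing, n_proposed)
```

For this example only H0 + H1 is mirrored among the subfilters of (5), while both H0 + H1 and H0 - H1 are mirrored among those of (8).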

B. 3×3 Proposed FFA (L=3)

Using an approach similar to (6), a three-parallel FIR filter can also be written as (10). Fig. 4 shows implementation of the proposed three-parallel FIR filter. When the number of symmetric coefficients N is a multiple of 3, the proposed three-parallel FIR filter structure in (10) provides four subfilter blocks with symmetric coefficients in total, whereas the existing FFA parallel FIR filter structure has only two such blocks out of six. The proposed structure is

$$\begin{aligned} Y_0 &= \tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)] - H_1 X_1 \\ &\quad + Z^{-3}\{(H_0 + H_1 + H_2)(X_0 + X_1 + X_2) - (H_0 + H_2)(X_0 + X_2) \\ &\quad - \tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)] - H_1 X_1\}, \\ Y_1 &= \tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)] \\ &\quad + Z^{-3}\{\tfrac{1}{2}[(H_0 + H_2)(X_0 + X_2) + (H_0 - H_2)(X_0 - X_2)] \\ &\quad - \tfrac{1}{2}[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)] + H_1 X_1\}, \\ Y_2 &= \tfrac{1}{2}[(H_0 + H_2)(X_0 + X_2) - (H_0 - H_2)(X_0 - X_2)] + H_1 X_1. \end{aligned} \quad (10)$$

Therefore, for an N-tap three-parallel FIR filter, the proposed structure saves N/3 multipliers compared with the existing FFA structure. However, the proposed three-parallel FIR structure also brings an overhead of seven additional adders in the preprocessing and postprocessing blocks.
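The subfilter counts can be reproduced with a short check. The sketch assumes a symmetric 12-tap filter with arbitrary illustrative values, lists the six subfilter coefficient sets of the existing and proposed three-parallel structures, and counts those that are symmetric or antisymmetric (antisymmetric blocks are assumed to allow the same multiplier sharing).

```python
def add(a, b):
    return [u + v for u, v in zip(a, b)]


def sub(a, b):
    return [u - v for u, v in zip(a, b)]


def mirrored(c):
    """True for symmetric or antisymmetric coefficient sets."""
    return c == c[::-1] or c == [-v for v in reversed(c)]


h = [1, 5, -2, 7, 3, 9, 9, 3, 7, -2, 5, 1]   # N = 12, h(n) = h(N - 1 - n)
H0, H1, H2 = h[0::3], h[1::3], h[2::3]

# Subfilters of the existing three-parallel FFA, as in (6).
existing = [H0, H1, H2, add(H0, H1), add(H1, H2), add(add(H0, H1), H2)]
# Subfilters of the proposed structure, as in (10).
proposed = [add(H0, H1), sub(H0, H1), H1,
            add(add(H0, H1), H2), add(H0, H2), sub(H0, H2)]

n_existing = sum(mirrored(c) for c in existing)
n_proposed = sum(mirrored(c) for c in proposed)
# Each additional mirrored block saves N / (2 * 3) multipliers.
saved = (n_proposed - n_existing) * len(h) // (2 * 3)
print(n_existing, n_proposed, saved)
```

With N = 12 the two additional mirrored blocks save 2 × 2 = 4 multipliers, which agrees with N/3.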



#### 4. SIMULINK AND SYSTEM GENERATOR DETAILS

A simple implementation of the two-parallel FIR filter structure is given below as an example.



Fig.4. Two-parallel FIR filter implementation in SIMULINK.

The device utilization of the two-parallel digital FIR filter is given below.



Fig.5. Device utilization of the two-parallel digital FIR filter.

After the derivation from equation (10), the SIMULINK block diagram is implemented. The SIMULINK representation of the four-parallel FIR filter structure is given below.



Fig.6. SIMULINK representation of the four-parallel FIR filter.

After the design is implemented in SIMULINK, the required HDL (VHDL or Verilog) netlist files are obtained from the SYSTEM GENERATOR tool in XILINX.

When an L-parallel FIR filter comes with a set of symmetric coefficients of length N, the number of multipliers required by the proposed parallel FIR filter structures is given by (11) and (12).

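Case I:

When $N / \prod_{i=1}^{r} L_i$ is even,

$$M = \frac{N}{\prod_{i=1}^{r} L_i} \left( \prod_{i=1}^{r} M_i - \frac{S}{2} \right) \quad (11)$$

Case II:

When $N / \prod_{i=1}^{r} L_i$ is odd,

$$M = \frac{N}{\prod_{i=1}^{r} L_i} \prod_{i=1}^{r} M_i - \frac{S}{2} \left( \frac{N}{\prod_{i=1}^{r} L_i} - 1 \right) \quad (12)$$

Here r is the number of FFAs used, $L_i$ is the block size of the FFA at step i, $M_i$ is the number of subfilters that result from applying the FFA at step i, and S is the number of subfilter blocks containing symmetric coefficients.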

The number of adders required in the subfilter section is given by

$$A_{sub} = \prod_{i=1}^{r} M_i \left( \frac{N}{\prod_{i=1}^{r} L_i} - 1 \right) \quad (13)$$
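The counts in (11) to (13) are easy to tabulate. The sketch below is one reading of those formulas, with hypothetical function and parameter names: `L_list` and `M_list` hold the block size and subfilter count of each FFA stage, and `S` is the number of subfilter blocks with symmetric coefficients. For the odd case it assumes that each symmetric block of odd length saves (length - 1)/2 multipliers.

```python
from math import prod


def multipliers(N, L_list, M_list, S):
    """Multiplier count for an N-tap filter, following (11) and (12)."""
    sub_len, rem = divmod(N, prod(L_list))
    if rem:
        raise ValueError("N must be a multiple of the total block size")
    total = sub_len * prod(M_list)            # count without any symmetry saving
    if sub_len % 2 == 0:
        return total - S * sub_len // 2       # (11): each symmetric block saves half
    return total - S * (sub_len - 1) // 2     # (12): the centre tap is not shared


def subfilter_adders(N, L_list, M_list):
    """Adders inside the subfilter section, following (13)."""
    return prod(M_list) * (N // prod(L_list) - 1)


# 24-tap examples: two-parallel (3 subfilters, S = 2), three-parallel (6 subfilters, S = 4).
print(multipliers(24, [2], [3], 2))
print(multipliers(24, [3], [6], 4))
print(subfilter_adders(24, [2], [3]))
```

Setting `S` to the count of the existing three-parallel FFA (S = 2) gives 40 multipliers for 24 taps against 32 for S = 4, a difference of N/3 = 8.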

#### 5. FUTURE WORK AND CONCLUSION

In this paper we have presented new parallel FIR filter structures, which are beneficial to symmetric convolutions when the number of taps is a multiple of 2, 3, or 4, using the SIMULINK and SYSTEM GENERATOR tools. Future work will derive efficient 5- and 6-parallel FIR filter structures. Overall, this paper provides new simulated parallel FIR filter structures built on polyphase decompositions that take advantage of symmetric convolutions.

#### REFERENCES

- [1] Yu-Chi Tsao and Ken Choi, "Area-efficient parallel FIR digital filter structures for symmetric convolutions based on fast FIR algorithm," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, Feb. 2012.
- [2] D. A. Parker and K. K. Parhi, "Low-area/power parallel FIR digital filter implementations," J. VLSI Signal Process. Syst., vol. 17, no. 1, pp. 75-92, 1997.
- [3] J. G. Chung and K. K. Parhi, "Frequency-spectrum-based low-area low-power parallel FIR filter design," EURASIP J. Appl. Signal Process., vol. 2002, no. 9, pp. 444-453, 2002.
- [4] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
- [5] Z.-J. Mou and P. Duhamel, "Short-length FIR filters and their use in fast nonrecursive filtering," IEEE Trans. Signal Process., vol. 39, no. 6, pp. 1322-1332, Jun. 1991.
- [6] C. Cheng and K. K. Parhi, "Hardware efficient fast parallel FIR filter structures based on iterated short convolutions," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 8, pp. 1492-1500, Aug. 2004.
- [7] C. Cheng and K. K. Parhi, "Further complexity reduction of parallel FIR filter," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS 2005), Kobe, Japan, May 2005.
- [8] C. Cheng and K. K. Parhi, "Low-cost parallel FIR structures with 2-stage parallelism," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 2, pp. 280-290, Feb. 2007.
- [9] I.-S. Lin and S. K. Mitra, "Overlapped block digital filtering," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 43, no. 8, pp. 586-596, Aug. 1996.